Search CORE

Deep Blue Documents at the University of Michigan

IPAD: Stable Interpretable Forecasting with Knockoffs Inference

Author: Billingsley P.
Bonferroni C. E.
Fan Y.
Holm S.
Jinchi Lv
Mahrad Sharifvaghefi
Vizcarra A. B.
Wooldridge J. M.
Yingying Fan
Yoshimasa Uematsu
Publication venue: Graduate School of Economics, Tohoku University (東北大学大学院経済学研究科)
Publication date: 06/09/2018

Interpretability and stability are two important features desired in many contemporary big data applications arising in economics and finance. While the former is enjoyed to some extent by many existing forecasting approaches, the latter, in the sense of controlling the fraction of wrongly discovered features, which can greatly enhance interpretability, is still largely underdeveloped in econometric settings. To this end, in this paper we exploit the general framework of model-X knockoffs introduced recently in Candès, Fan, Janson and Lv (2018), which is nonconventional for reproducible large-scale inference in that the framework is completely free of the use of p-values for significance testing, and suggest a new method of intertwined probabilistic factors decoupling (IPAD) for stable interpretable forecasting with knockoffs inference in high-dimensional models. The recipe of the method is to construct the knockoff variables by assuming a latent factor model, which is exploited widely in economics and finance, for the association structure of covariates. Our method and work are distinct from the existing literature in that we estimate the covariate distribution from data instead of assuming that it is known when constructing the knockoff variables, our procedure does not require any sample splitting, we provide theoretical justifications on the asymptotic false discovery rate control, and the theory for the power analysis is also established. Several simulation examples and the real data analysis further demonstrate that the newly suggested method has appealing finite-sample performance with desired interpretability and stability compared to some popularly used forecasting methods.

arXiv.org e-Print Archive

Tohoku University Repository (TOUR) / 東北大学機関リポジトリ

Gravitational waves from Scorpius X-1: A comparison of search methods and prospects for detection with advanced detectors

Author: A. Baykal
A. Melatos
A. R. King
C. Messenger
C. E. Bonferroni
D. K. Galloway
E. Goetz
E. H. Thrane
G. D. Meadors
H. J. Bulten
J. T. Whelan
K. Jahoda
K. Riles
L. Sammut
P. D. Lasky
R. J. G. Jonker
S. Premachandra
S. G. Crowder
V. Dergachev
Y. Zhang
Publication venue: American Physical Society (APS)

Investigating the Correlation between Performance Scores and Energy Consumption of Mobile Web Apps

Author: Bonferroni C.
Choudhary S. R.
Cruz L.
Gottschalk M.
Grissom R. J.
Joorabchi M. E.
Mahajan S.
Malavolta I.
Nejati J.
Nucci D. Di
Ocariza F. S.
Palomba F.
Thiagarajan N.
Vegas S.
Vesuna J.
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/04/2020

Context. Developers have access to tools like Google Lighthouse to assess the performance of web apps and to guide the adoption of development best practices. However, when it comes to energy consumption of mobile web apps, these tools seem to be lacking. Goal. This study investigates the correlation between the performance scores produced by Lighthouse and the energy consumption of mobile web apps. Method. We design and conduct an empirical experiment where 21 real mobile web apps are (i) analyzed via the Lighthouse performance analysis tool and (ii) measured on an Android device running a software-based energy profiler. Then, we statistically assess how energy consumption correlates with the obtained performance scores and carry out an effect size estimation. Results. We discover a statistically significant negative correlation between performance scores and the energy consumption of mobile web apps (with medium to large effect sizes), implying that an increase in the performance score tends to lead to a decrease in energy consumption. Conclusions. We recommend that developers strive to improve the performance level of their mobile web apps, as this can also have a positive impact on their energy consumption on Android devices.

VU Research Portal

Semi-supervised discovery of differential genes

Author: A Lewin
B Efron
C Furlanello
CE Bonferroni
D Singh
E Bair
E Wit
J Neyman
J Storey
J Weston
JD Storey
JT Leek
K Najarian
KB Duan
M Bhattacharjee
M Seeger
N Dean
OG Troyanskaya
P Broberg
R Gottardo
R Tibshirani
Shigeyuki Oba
Shin Ishii
TR Golub
U Alon
VG Tusher
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Various statistical scores have been proposed for evaluating the significance of genes that may exhibit differential expression between two or more controlled conditions. However, in many clinical studies, for example those that aim to detect clinical marker genes, the conditions have not necessarily been controlled well, and condition labels are sometimes hard to obtain due to physical, financial, and time costs. In such a situation, we can consider an unsupervised case where labels are not available or a semi-supervised case where labels are available for a part of the whole sample set, rather than a well-studied supervised case where all samples have their labels. RESULTS: We assume a latent variable model for the expression of active genes and apply the optimal discovery procedure (ODP) proposed by Storey (2005) to the model. Our latent variable model allows gene significance scores to be applied to unsupervised and semi-supervised cases. The ODP framework improves detectability by sharing the estimated parameters of null and alternative models of multiple tests over multiple genes. A theoretical consideration leads to two different interpretations of the latent variable: either it only implicitly affects the alternative model through the model parameters, or it is explicitly included in the alternative model. The two interpretations correspond to two different implementations of ODP. By comparing the two implementations through experiments with simulation data, we have found that sharing the latent variable estimation is effective for increasing the detectability of truly active genes. We also show that the unsupervised and semi-supervised rating of genes, which takes into account the samples without condition labels, can improve detection of active genes in real gene discovery problems.
CONCLUSION: The experimental results indicate that the ODP framework is effective for hypotheses including latent variables and is further improved by sharing the estimations of hidden variables over multiple tests.

Springer - Publisher Connector

The Impacts of Reduced Access to Abortion and Family Planning Services: Evidence from Texas

Author: Alberto Abadie
Amanda J Stevenson
Amy M Branum
Carlo E Bonferroni
Corey White
Elizabeth Ananat
Fairmount Center
George A Akerlof
Heather Royer
James Trussell
Jeffrey M Wooldridge
Kari White
Marianne Bitler
Martha J Bailey
Planned Parenthood
Rebecca M Blank
Stefanie Fischer
Susan L Averett
Sylvain Weber
Thomas C Buchmueller
Zolna
Publication venue: Elsevier BV
Publication date: 01/01/2017

Public Library of Science (PLOS)

Phenotype Sequencing: Identifying the Genes That Cause a Phenotype Directly from Pooled Sequencing of Independent Mutants

Author: A Futschik
A Srivatsan
C Herring
C Honisch
C Lee
CE Bonferroni
Christopher J. Lee
D Lee
D Smith
E Jones
G Velicer
H Li
Iara M. P. Machado
J Barrick
J Cridland
J Klockgether
J Miller
J Ohnishi
James C. Liao
JL Cocchiaro
K Holt
K McKernan
Marc A. Harper
O Harismendy
P Chen
P Cock
Raphael Valdivia
S Atsumi
S Le Crom
Stanley F. Nelson
T Conrad
T Hanai
Traci Toy
Zugen Chen
Publication venue: Public Library of Science
Publication date: 18/02/2011

Random mutagenesis and phenotype screening provide a powerful method for dissecting microbial functions, but their results can be laborious to analyze experimentally. Each mutant strain may contain 50–100 random mutations, necessitating extensive functional experiments to determine which one causes the selected phenotype. To solve this problem, we propose a “Phenotype Sequencing” approach in which genes causing the phenotype can be identified directly from sequencing of multiple independent mutants. We developed a new computational analysis method showing that 1. causal genes can be identified with high probability from even a modest number of mutant genomes; 2. costs can be cut many-fold compared with a conventional genome sequencing approach via an optimized strategy of library-pooling (multiple strains per library) and tag-pooling (multiple tagged libraries per sequencing lane). We have performed extensive validation experiments on a set of E. coli mutants with increased isobutanol biofuel tolerance. We generated a range of sequencing experiments varying from 3 to 32 mutant strains, with pooling on 1 to 3 sequencing lanes. Our statistical analysis of these data (4099 mutations from 32 mutant genomes) successfully identified 3 genes (acrB, marC, acrA) that have been independently validated as causing this experimental phenotype. It must be emphasized that our approach reduces mutant sequencing costs enormously. Whereas a conventional genome sequencing experiment would have cost $7,200 in reagents alone, our Phenotype Sequencing design yielded the same information value for only $1200. In fact, our smallest experiments reliably identified acrB and marC at a cost of only $110–$340.

Public Library of Science (PLOS)

Structural Relationships between Highly Conserved Elements and Genes in Vertebrate Genomes

Author: A Derti
A Sandelin
A Siepel
A Woolfe
C Bonferroni
CE Bishop
CJ Lin
D Boffelli
D Martin
ET Dermitzakis
G Bejerano
G Bourque
Geir Skogerbø
GK McEwen
H Kikuta
H Sun
Hong Sun
IM Meyer
JA Bailey
Jason E. Stajich
JH Postlethwait
KP O'Brien
LA Lettice
LA Pennacchio
M Kikuchi
MA Nobrega
N Ahituv
NJ Bachman
O Dubourg
R Ihaka
R Varon
S Jhunjhunwala
S Stephen
SY Kim
T Nagase
T Vavouri
T Williams
V Veeramachaneni
Wei Liu
WG Fairbrother
WJ Kent
WP Dirksen
Y Benjamini
Yixue Li
Zhen Wang
Publication venue: Public Library of Science
Publication date: 14/11/2008

Large numbers of sequence elements have been identified as highly conserved among vertebrate genomes. These highly conserved elements (HCEs) are often located in or around genes that are involved in transcription regulation and early development. They have been shown to be involved in cis-regulatory activities through both in vivo and additional computational studies. We have investigated the structural relationships between such elements and genes in six vertebrate genomes (human, mouse, rat, chicken, zebrafish and tetraodon) and detected several thousand cases of conserved HCE-gene associations, and also cases of HCEs with no common target genes. A few examples underscore the potential significance of our findings about several individual genes. We found that the conserved associations between HCEs and genes are not restricted to elements by their absolute distance on the genome. Notably, long-range associations were identified and the molecular functions of the associated genes do not show any particular overrepresentation of the functional categories previously reported. HCEs in close proximity are found to be linked with different sets of genes. The results reflect the highly complex correlation between HCEs and their putative target genes.

Allele Frequency Estimation from Ambiguous Data: Using Resampling Schema in Validating Frequency Estimates and in Selective Neutrality Testing

Author: Alicia Sanchez-Mazas
Bonferroni C.
Dempster A.
Efron B.
Excoffier L.
Jean-Marie Tiercy
José Manuel Nunes
Lancaster A.
Maria Eugenia Riccio
Nunes J. M.
Riccio M. E.
Scott I.
Watterson G. A.
Publication venue: Human Biology (The International Journal of Population Biology and Genetics)

Public Library of Science (PLOS)

Transient exposure to low levels of insecticide affects metabolic networks of honeybee larvae

Author: A Raj
AA Teleman
AL Toth
Alessandro Guffanti
Anna Moles
B Chen
B Wang
Baohong Zhang
C Bonferroni
C Claudianos
C Frei
C Grandori
C Voellenkle
Catharine A. Ortori
CH Waddington
Charles Snart
CW Schneider
DA Guertin
David A. Barrett
DF Jarosz
Diane P. Genereux
E Hornstein
EC Yang
EG Bligh
Eugene Schuster
F Li
G Vansant
HY Tang
I Laycock
I Zinke
J Staples
J Varghese
JE Baenziger
JM Tennessen
JR Misra
K King-Jones
K Matsuda
Kamila Derecka
L Li
L Palanker
L Wang
LA Johnston
M Bujold
M Giraudo
M Kanehisa
M Tomizawa
MA Horner
Martin J. Blythe
MC Frith
MD Robinson
MH Sieber
MS Dionne
N Alic
N Shomron
P Daran-Lapujade
P Flicek
P Jeschke
P Xu
Paolo Pavan
PJ Daborn
PJ Kersey
R Delanoue
Reinhard Stöger
S Anders
S Shah
S Wullschleger
SL Rutherford
Sunir Malla
T Blacquiere
T Kunieda
TA Sangster
THGS Consortium
Thomas Ryder
TR Li
V Hilgers
V Sollars
V Specchia
WF Eanes
WM Wheeler
X Li
Y Wang
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2013

The survival of a species depends on its capacity to adjust to changing environmental conditions and new stressors. Such new, anthropogenic stressors include the neonicotinoid class of crop-protecting agents, which have been implicated in the population declines of pollinating insects, including honeybees (Apis mellifera). The low-dose effects of these compounds on larval development and physiological responses have remained largely unknown. Over a period of 15 days, we provided syrup tainted with low levels (2 µg/L) of the neonicotinoid insecticide imidacloprid to beehives located in the field. We measured transcript levels by RNA sequencing and established lipid profiles using liquid chromatography coupled with mass spectrometry from worker-bee larvae of imidacloprid-exposed (IE) and unexposed, control (C) hives. Within a catalogue of 300 differentially expressed transcripts in larvae from IE hives, we detect significant enrichment of genes functioning in lipid-carbohydrate-mitochondrial metabolic networks. Myc-involved transcriptional response to exposure of this neonicotinoid is indicated by overrepresentation of E-box elements in the promoter regions of genes with altered expression. RNA levels for a cluster of genes encoding detoxifying P450 enzymes are elevated, with coordinated downregulation of genes in glycolytic and sugar-metabolising pathways. Expression of the environmentally responsive Hsp90 gene is also reduced, suggesting diminished buffering and stability of the developmental program. The multifaceted, physiological response described here may be of importance to our general understanding of pollinator health. Muscles, for instance, work at high glycolytic rates and flight performance could be impacted should low levels of this evolutionarily novel stressor likewise induce downregulation of energy metabolising genes in adult pollinators.

Nottingham ePrints

CiteSeerX

Nottingham eTheses

Repository@Nottingham